Everything under the hood — providers and models, the local engine, memory, tooling, storage, and where your data actually lives. Reference-grade detail, kept off the front page on purpose.
Matches Valence v1.0.1 · Windows desktopA full model runs on your machine out of the box — no account, no key, no internet.
llama-server sidecar built from llama.cpp.1 Local models use the GGUF format (the file format llama.cpp loads). Most open-source models are published in GGUF or can be converted to it. ↩
Bring your own key (BYOK) for any cloud provider — your keys are stored encrypted on your machine and calls go straight to the provider, never through Helix.
| Provider | Representative models | Type |
|---|---|---|
| Local (bundled) Gemma 4 E2B-it · llama-server | The fastest path to a fully offline conversation. | Local |
| Your own local models most popular open models | Load models you've downloaded yourself and run them on the built-in engine — fully offline. | Local |
| Ollama your Ollama library | Already use Ollama? Connect it and use those models too — air-gapped. | Local |
| OpenAI GPT-4.1 · GPT-4o · o-series | Fast generalist coverage for writing, coding, multimodal. | BYOK |
| Anthropic Claude Opus 4 · Sonnet 4 · 3.5 | Strong analysis, careful reasoning, review-heavy work. | BYOK |
| Google Gemini 2.5 / 3 Pro · Flash | Fast synthesis and broad coverage for research. | BYOK |
| xAI Grok 4 | Frontier general model. | BYOK |
| Azure OpenAI enterprise tier | OpenAI models under your Azure tenancy. | BYOK |
| Vertex AI Google enterprise | Gemini under Google Cloud governance. | BYOK |
| OpenRouter 100+ aggregated | One key, the long tail of models. | BYOK |
@-mention any configured model mid-thread; context carries across the switch.Persistent, on-device recall — the model keeps useful context across sessions without anything leaving your disk.
#remember{ } — e.g. #remember{user prefers Python}. Saved to local memory.#recall{ } searches your memory for relevant facts and feeds them back into the conversation.Documents\AIOverlay\LocalMemory\) — per profile. Never transmitted.Give the model a real action layer — search the web, read your files, query databases — through the Model Context Protocol.
@modelcontextprotocol/server-filesystemserver-brave-search (web search)server-memory#remember, #recall) work the same way — invoked inline during conversation.Each profile carries its own model, system prompt, preferences and memory — and what one knows, the next can't see.
profiles/<name>/). Your work profile and your home profile never share context.You can open, read and audit everything Valence stores — it's all on your machine.
%AppData%\AIOverlayV2.1\settings.jsonDocuments\AIOverlay (configurable) — stored as JSON with full message metadata.